notebooks/tutorials/bigquery/BigQuery command-line tool.ipynb

{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# BigQuery command-line tool\n", "\n", "The BigQuery command-line tool is installed as part of the [Cloud SDK](https://cloud-dot-devsite.googleplex.com/sdk/docs/) and can be used to interact with BigQuery. When you use CLI commands in a notebook, the command must be prepended with a `!`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## View available commands\n", "\n", "To view the available commands for the BigQuery command-line tool, use the `help` command." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq help" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a new dataset\n", "\n", "A dataset is contained within a specific [project](https://cloud.google.com/bigquery/docs/projects). Datasets are top-level containers that are used to organize and control access to your [tables](https://cloud.google.com/bigquery/docs/tables) and [views](https://cloud.google.com/bigquery/docs/views). A table or view must belong to a dataset. You need to create at least one dataset before [loading data into BigQuery](https://cloud.google.com/bigquery/loading-data-into-bigquery).\n", "\n", "First, name your new dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_id = \"your_new_dataset\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following command creates a new dataset in the US using the ID defined above.\n", "\n", "NOTE: In the examples in this notebook, the `dataset_id` variable is referenced in the commands using both `{}` and `$`. To avoid creating and using variables, replace these interpolated variables with literal values and remove the `{}` and `$` characters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq --location=US mk --dataset $dataset_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The response should look like the following:\n", "\n", "```\n", "Dataset 'your-project-id:your_new_dataset' successfully created.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List datasets\n", "\n", "The following command lists all datasets in your default project." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The response should look like the following:\n", "\n", "```\n", " datasetId \n", " ------------------------------ \n", " your_new_dataset \n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data from a local file to a table\n", "\n", "The following example demonstrates how to load a local CSV file into a new or existing table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Loading Data into BigQuery from a local data source](https://cloud.google.com/bigquery/docs/loading-data-local) in the BigQuery documentation." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq \\\n", " --location=US \\\n", " load \\\n", " --autodetect \\\n", " --skip_leading_rows=1 \\\n", " --source_format=CSV \\\n", " {dataset_id}.us_states_local_file \\\n", " 'resources/us-states.csv'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data from Cloud Storage to a table\n", "\n", "The following example demonstrates how to load a local CSV file into a new table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Introduction to loading data from Cloud Storage](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage) in the BigQuery documentation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq \\\n", " --location=US \\\n", " load \\\n", " --autodetect \\\n", " --skip_leading_rows=1 \\\n", " --source_format=CSV \\\n", " {dataset_id}.us_states_gcs \\\n", " 'gs://cloud-samples-data/bigquery/us-states/us-states.csv'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run a query\n", "\n", "The BigQuery command-line tool has a `query` command for running queries, but it is recommended to use the [magic command](./BigQuery%20Query%20Magic.ipynb) for this purpose." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cleaning Up\n", "\n", "The following code deletes the dataset created for this tutorial, including all tables in the dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!bq rm -r -f --dataset $dataset_id" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }